Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster
In this work, we propose FastCoT, a model-agnostic framework based on
parallel decoding without any further training of an auxiliary model or
modification to the LLM itself. FastCoT uses a context window whose size
varies with position to conduct parallel decoding and autoregressive decoding
simultaneously, thus fully utilizing GPU computation
resources. In FastCoT, the parallel decoding part provides the LLM with a quick
glance of the future composed of approximate tokens, which can lead to answers
faster than the regular autoregressive decoding used by causal
transformers. We also provide an implementation of parallel decoding within
the LLM, which supports KV-cache generation and batch processing. Through extensive
experiments, we demonstrate that FastCoT saves inference time by nearly 20%
with only a negligible performance drop compared to the regular approach.
Additionally, we show that the context window size exhibits considerable
robustness across different tasks.
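As a rough illustration of how a single forward pass can both advance exact decoding and refresh an approximate glance of the future, consider the minimal sketch below. It assumes a HuggingFace-style causal LM whose forward pass returns per-position logits, a fixed-size lookahead window, and greedy selection; these are simplifications, not the authors' implementation (FastCoT's window size varies with position).

```python
# Minimal sketch: one forward pass over [exact prefix | approximate window]
# extends the autoregressive prefix by one exact token and refreshes the
# approximate future tokens (Jacobi-style parallel decoding).
import torch

@torch.no_grad()
def fastcot_step(model, prefix_ids, approx_ids):
    """One decoding step; prefix_ids and approx_ids are 1-D token tensors."""
    ids = torch.cat([prefix_ids, approx_ids]).unsqueeze(0)
    logits = model(ids).logits[0]          # (seq_len, vocab_size)
    preds = logits.argmax(dim=-1)          # greedy prediction at each position
    n = prefix_ids.shape[-1]
    next_token = preds[n - 1 : n]          # exact: extends the prefix
    new_approx = preds[n:]                 # approximate: refreshed lookahead
    return torch.cat([prefix_ids, next_token]), new_approx
```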
LinkLouvain: Link-Aware A/B Testing and Its Application on Online Marketing Campaign
Many online marketing campaigns aim to promote user interaction. The
average treatment effect (ATE) of campaign strategies needs to be monitored
throughout the campaign. A/B testing is usually conducted for such needs,
whereas the existence of user interaction can introduce interference to normal
A/B testing. With the help of link prediction, we design a network A/B testing
method, LinkLouvain, that minimizes graph interference and gives an accurate
and sound estimate of the campaign's ATE. In this paper, we analyze the network A/B
testing problem under a real-world online marketing campaign, describe our
proposed LinkLouvain method, and evaluate it on real-world data. Our method
achieves significantly better performance than other approaches and is deployed
in the online marketing campaign.
Comment: Accepted by the Industrial & Practitioner Track of the 26th
International Conference on Database Systems for Advanced Applications
(DASFAA 2021).
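A hedged sketch of cluster-level randomization in the spirit of LinkLouvain: build a graph weighted by predicted link scores, cluster it with Louvain, assign treatment per cluster, and estimate the ATE by a difference in means. The graph construction and this naive estimator are assumptions drawn only from the abstract, not the paper's exact procedure.

```python
import networkx as nx
import numpy as np

def network_ab_test(scored_edges, outcomes, seed=0):
    """scored_edges: iterable of (u, v, predicted_link_score);
    outcomes: {user: observed metric}."""
    G = nx.Graph()
    G.add_weighted_edges_from(scored_edges)
    clusters = nx.community.louvain_communities(G, weight="weight", seed=seed)
    rng = np.random.default_rng(seed)
    treat, control = [], []
    for cluster in clusters:              # randomize whole clusters, not users,
        bucket = treat if rng.random() < 0.5 else control  # to limit interference
        bucket.extend(outcomes[u] for u in cluster if u in outcomes)
    return float(np.mean(treat) - np.mean(control))   # naive ATE estimate
```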
Who Would be Interested in Services? An Entity Graph Learning System for User Targeting
With the growing popularity of various mobile devices, user targeting has
received increasing attention, aiming to effectively and efficiently locate
target users who are interested in specific services. Most pioneering works
on user targeting perform similarity-based expansion with a few active users
as seeds, and suffer from two major issues: the unavailability of seed users
for newly launched services and the unfriendliness of black-box procedures
towards marketers. In this paper, we design an Entity Graph Learning (EGL)
system that provides explainable user targeting while also addressing the
cold-start issue.
The EGL system follows a hybrid online-offline architecture to satisfy the
requirements of scalability and timeliness. Specifically, in the offline stage,
the system focuses on heavyweight entity graph construction and user entity
preference learning, for which we propose a Three-stage Relation Mining
Procedure (TRMP) that removes the dependence on expensive seed users. In the
online stage, the system performs user targeting in real time based on the
entity graph from the offline stage. Since the user targeting process is based
on graph reasoning, the whole process is transparent and operation-friendly to
marketers. Finally, extensive offline experiments and online A/B testing
demonstrate the superior performance of the proposed EGL system.
Comment: Accepted by ICDE 202
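As an illustration of how the online stage could target users transparently by reasoning over the offline-built entity graph, here is a sketch; the graph schema, one-hop expansion, and additive scoring are illustrative assumptions, not the paper's design.

```python
from collections import defaultdict

def target_users(entity_graph, user_prefs, service_entities, min_score=0.5):
    """entity_graph: {entity: related entities};
    user_prefs: {user: {entity: learned preference score}}."""
    # expand the service's entities one hop through the entity graph
    expanded = set(service_entities)
    for e in service_entities:
        expanded.update(entity_graph.get(e, ()))
    # score each user by preference over the expanded entity set; every kept
    # user is explainable by the concrete entities that matched
    scores = defaultdict(float)
    for user, prefs in user_prefs.items():
        for e in expanded:
            scores[user] += prefs.get(e, 0.0)
    return {u: s for u, s in scores.items() if s >= min_score}
```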
Think-in-Memory: Recalling and Post-thinking Enable LLMs with Long-Term Memory
Memory-augmented Large Language Models (LLMs) have demonstrated remarkable
performance in long-term human-machine interactions, which basically relies on
iteratively recalling and reasoning over the history to generate high-quality
responses. However, such repeated recall-reason steps easily produce biased
thoughts, i.e., inconsistent reasoning results when recalling the same history
for different questions. In contrast, humans can keep thoughts in memory and
recall them without repeating the reasoning. Motivated by this human
capability, we propose a novel memory mechanism called TiM (Think-in-Memory)
that enables LLMs to maintain an evolving memory for storing historical thoughts
along the conversation stream. The TiM framework consists of two crucial
stages: (1) before generating a response, an LLM agent recalls relevant thoughts
from memory, and (2) after generating a response, the LLM agent post-thinks and
incorporates both historical and new thoughts to update the memory. Thus, TiM
can eliminate the issue of repeated reasoning by saving the post-thinking
thoughts as the history. Besides, we formulate basic principles for organizing
the thoughts in memory based on well-established operations (i.e., insert,
forget, and merge), allowing for dynamic
updates and evolution of the thoughts. Furthermore, we introduce
Locality-Sensitive Hashing into TiM to achieve efficient retrieval in
long-term conversations. We conduct qualitative and quantitative experiments on
real-world and simulated dialogues covering a wide range of topics,
demonstrating that equipping existing LLMs with TiM significantly enhances
their performance in generating responses for long-term interactions.
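A compact sketch of TiM-style thought storage follows, assuming thoughts arrive as (text, embedding) pairs. The random-hyperplane LSH and the insert/forget operations follow the abstract; merge (combining near-duplicate thoughts) is omitted for brevity, and embeddings and LLM calls are placeholders.

```python
import numpy as np

class ThoughtMemory:
    def __init__(self, dim, n_planes=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_planes, dim))  # LSH hyperplanes
        self.buckets = {}                               # signature -> [(text, vec)]

    def _hash(self, vec):
        return tuple(bool(b) for b in (self.planes @ vec > 0))

    def insert(self, text, vec):                        # store a post-thought
        self.buckets.setdefault(self._hash(vec), []).append((text, vec))

    def recall(self, query_vec, k=3):                   # before responding
        candidates = self.buckets.get(self._hash(query_vec), [])
        candidates = sorted(candidates, key=lambda tv: -float(tv[1] @ query_vec))
        return [text for text, _ in candidates[:k]]

    def forget(self, outdated):                         # drop stale thoughts
        for sig, items in self.buckets.items():
            self.buckets[sig] = [tv for tv in items if not outdated(tv[0])]
```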
Adaptive Optimizers with Sparse Group Lasso for Neural Networks in CTR Prediction
We develop a novel framework that adds the regularizers of the sparse group
lasso to a family of adaptive optimizers in deep learning, such as Momentum,
Adagrad, Adam, AMSGrad, and AdaHessian, creating a new class of optimizers
named Group Momentum, Group Adagrad, Group Adam, Group AMSGrad, Group
AdaHessian, etc., accordingly. We prove convergence guarantees in the
stochastic convex setting based on primal-dual methods. We evaluate the
regularization effect of our new optimizers on three large-scale
real-world ad click datasets with state-of-the-art deep learning models. The
experimental results reveal that, compared with the original optimizers
followed by a magnitude-pruning post-processing step, our methods significantly
improve model performance at the same sparsity level. Furthermore, in
comparison to the cases without magnitude pruning, our
methods can achieve extremely high sparsity with significantly better or highly
competitive performance. The code is available at
https://github.com/intelligent-machine-learning/dlrover/blob/master/tfplus.
Comment: 24 pages. Published as a conference paper at ECML PKDD 2021. This
version includes the Appendix, which was not included in the published version
because of the page limit.
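To make the core mechanic concrete, here is a hedged sketch that treats parameter rows (e.g. embedding rows) as groups: one Adam step followed by the group-lasso proximal operator, which shrinks rows and zeroes out whole groups. This is the standard proximal form, not a reproduction of the paper's primal-dual construction.

```python
import numpy as np

def group_adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
                    eps=1e-8, lam=1e-4):
    """In-place Adam update on matrix w, then group soft-thresholding per row."""
    m[:] = b1 * m + (1 - b1) * grad
    v[:] = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)                       # bias correction
    v_hat = v / (1 - b2 ** t)
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)        # plain Adam step
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    shrink = np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, eps))
    w *= shrink                                     # small rows collapse to zero
    return w
```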
Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning
We study the budget allocation problem in online marketing campaigns that
utilize previously collected offline data. We first discuss the long-term
effect of optimizing marketing budget allocation decisions in the offline
setting. To overcome the challenge this long-term effect poses, we propose a
novel game-theoretic offline value-based reinforcement learning method using
mixed policies. The proposed
method reduces the infinitely many policies that previous methods must store
to only constantly many policies, achieving nearly optimal policy efficiency
and making it practical and favorable for industrial usage. We further
show that this method is guaranteed to converge to the optimal policy, which
cannot be achieved by previous value-based reinforcement learning methods for
marketing budget allocation. Our experiments on a large-scale marketing
campaign with tens of millions of users and a budget of more than one billion
verify
the theoretical results and show that the proposed method outperforms various
baseline methods. The proposed method has been successfully deployed to serve
all the traffic of this marketing campaign.
Comment: WSDM 23, Best Paper Candidate.
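A toy instance of the mixed-policy idea follows, under strong assumptions: values[i] and costs[i] are off-policy estimates for a small set of stored deterministic policies, and we mix just two of them so the expected spend hits the budget. The paper's game-theoretic selection of which policies to mix and its convergence machinery are not shown.

```python
import numpy as np

def mix_two_policies(values, costs, budget):
    """Return (i, j, p): play stored policy i w.p. p, policy j w.p. 1 - p."""
    values, costs = np.asarray(values, float), np.asarray(costs, float)
    under = np.where(costs <= budget)[0]
    over = np.where(costs > budget)[0]
    if len(under) == 0:                 # nothing affordable: fall back to cheapest
        j = int(np.argmin(costs))
        return j, j, 1.0
    i = int(under[np.argmax(values[under])])   # best policy within budget
    if len(over) == 0:                  # budget is slack: no mixing needed
        return i, i, 1.0
    j = int(over[np.argmin(costs[over])])      # cheapest policy over budget
    p = (costs[j] - budget) / (costs[j] - costs[i])  # expected cost == budget
    return i, j, float(p)
```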
Leveraging Large Language Models for Pre-trained Recommender Systems
Recent advancements in recommendation systems have shifted towards more
comprehensive and personalized recommendations by utilizing large language
models (LLMs). However, effectively integrating LLMs' commonsense knowledge and
reasoning abilities into recommendation systems remains a challenging problem.
In this paper, we propose RecSysLLM, a novel pre-trained recommendation model
based on LLMs. RecSysLLM retains LLM reasoning and knowledge while integrating
recommendation domain knowledge through unique designs of data, training, and
inference. This allows RecSysLLM to leverage LLMs' capabilities for
recommendation tasks in an efficient, unified framework. We demonstrate the
effectiveness of RecSysLLM on benchmarks and real-world scenarios. RecSysLLM
provides a promising approach to developing unified recommendation systems by
fully exploiting the power of pre-trained language models.
Comment: 13 pages, 4 figures.
The pyruvate decarboxylase activity of IpdC is a limitation for isobutanol production by Klebsiella pneumoniae
BACKGROUND: Klebsiella pneumoniae contains an endogenous isobutanol synthesis pathway. The ipdC gene, annotated as an indole-3-pyruvate decarboxylase (Kp-IpdC), was identified to catalyze the formation of isobutyraldehyde from 2-ketoisovalerate.
RESULTS: Compared with 2-ketoisovalerate decarboxylase from Lactococcus lactis (KivD), a decarboxylase commonly used in artificial isobutanol synthesis pathways, Kp-IpdC has a 2.8-fold lower Km for 2-ketoisovalerate, leading to higher isobutanol production without induction. However, expression of ipdC by IPTG induction resulted in a low isobutanol titer. In vitro enzymatic reactions showed that Kp-IpdC exhibits promiscuous pyruvate decarboxylase activity, which adversely consumes the pyruvate precursor available for isobutanol synthesis. To address this, we engineered Kp-IpdC to reduce its pyruvate decarboxylase activity. From computational modeling, we identified 10 amino acid residues surrounding the active site for mutagenesis. Ten designs, consisting of eight single-point mutants and two double-point mutants, were selected for exploration. Mutants L546W and T290L, which showed only 5.1% and 22.1% of the catalytic efficiency of wild-type Kp-IpdC on pyruvate, respectively, were then expressed in K. pneumoniae for in vivo testing. Isobutanol production by K. pneumoniae T290L was 25% higher than that of the control strain, and a final titer of 5.5 g/L isobutanol was obtained with a substrate conversion ratio of 0.16 mol/mol glucose.
CONCLUSIONS: This research provides a new way to improve the efficiency of the biological route of isobutanol production.
Enhancing Recommender Systems with Large Language Model Reasoning Graphs
Recommendation systems aim to provide users with relevant suggestions, but
often lack interpretability and fail to capture higher-level semantic
relationships between user behaviors and profiles. In this paper, we propose a
novel approach that leverages large language models (LLMs) to construct
personalized reasoning graphs. These graphs link a user's profile and
behavioral sequences through causal and logical inferences, representing the
user's interests in an interpretable way. Our approach, LLM reasoning graphs
(LLMRG), has four components: chained graph reasoning, divergent extension,
self-verification and scoring, and knowledge base self-improvement. The
resulting reasoning graph is encoded using graph neural networks and serves
as an additional input to improve conventional recommender systems, without
requiring extra user or item information. Our approach demonstrates how LLMs
can enable more logical and interpretable recommender systems through
personalized reasoning graphs. LLMRG allows recommendations to benefit from
both engineered recommendation systems and LLM-derived reasoning graphs. We
demonstrate the effectiveness of LLMRG on benchmarks and real-world scenarios
in enhancing base recommendation models.
Comment: 12 pages, 6 figures.
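A schematic sketch of the first three LLMRG stages named above: llm(prompt) is a placeholder assumed to return a list of chain strings (or "yes"/"no" for verification), and gnn_encode is a placeholder GNN encoder. Stage 4, knowledge base self-improvement (e.g. caching verified chains for reuse), is omitted; the prompts are illustrative only.

```python
def build_reasoning_graph(llm, user_profile, behavior_seq):
    # 1) chained graph reasoning: infer causal/logical chains linking the
    #    user's behavioral sequence to their profile
    chains = llm(f"Infer causal chains linking {behavior_seq} to {user_profile}")
    # 2) divergent extension: branch out to plausible related interests
    extended = llm(f"Extend these chains with related interests: {chains}")
    # 3) self-verification and scoring: keep only chains the LLM can justify
    return [c for c in extended if llm(f"Is this chain sound? {c}") == "yes"]

def recommend(base_model, gnn_encode, reasoning_graph, user_features):
    # encode the verified graph and pass it to the conventional recommender as
    # an additional input, with no extra user or item information required
    return base_model(user_features, gnn_encode(reasoning_graph))
```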